System and method for predicting security price movements using financial news

ABSTRACT

A method of creating a price prediction model that forecasts short-term price fluctuations in financial instruments by collecting, analyzing and classifying financial news for a financial instrument into categories. Distributions for the changes in price of the financial instrument for a set period of time and distributions for the changes in price of the financial instrument as a result of the financial news for each news category for a set period of time are then obtained. If the distributions for the changes in price of the financial instrument are statistically significantly different than the distributions for the changes in price of the financial instrument for a particular news category, and the mean for the change in price is greater or less than zero, a signal is produced indicating the trading action that should be taken for the financial instrument.

PRIORITY

This application is a continuation of and claims priority of U.S. application Ser. No. 10/113,895 filed Mar. 28, 2002, which claims priority to U.S. provisional application 60/350,264 filed on Jan. 18, 2002.

BACKGROUND OF THE INVENTION

A. Field of the Invention

A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

Background and Prior Art

One “Holy Grail” in the financial markets is the development of an automated system that predicts price movements of financial instruments. If one is able to predict whether prices are moving up or down for financial instruments such as stocks, bonds, and commodities, then one would have a way to generate money. Several prediction strategies exist that find patterns in price fluctuations. They fall into two categories: fundamental analysis and technical analysis. Fundamental analysis is performed by an analyst who keeps abreast of the news and data affecting a specific stock or market. The successful analyst warehouses correlation in the market and predicts the correct trend. This type of analysis often involves a prediction with a long-term horizon, such as a few months or years. Technical analysis is performed by a person or machine that looks for numeric trends in changes in financial and economic measures. Technical analysis is often used for short-term and long-term trading. The following invention is a fusion of fundamental and technical analysis. The invention predicts the movement of a financial instrument given historical closing prices and daily financial news about the underlying financial instrument.

The Engineering and Economic research literature is replete with approaches that use historical stock prices and economic values for predicting when to purchase a stock. For example, Yoon and Swales used a four-layered neural network to determine well performing firms and poorly performing firms using nine economic measures as input [1]. However, these approaches, whether they use neural networks or statistical regression, do not incorporate the events, and in particular, the news events that are responsible for the actual day-to-day price movements.

Economic news event studies have motivated several research projects. A typical event study would determine if a correlation exists between price changes and a particular event such as stock splits, merger announcements, or the reporting of earnings. Page A-5 of this document contains an example using merger announcements. Some related research has used proxies for more general classifications of news. For example, Depken [4] uses a decomposition of volume as a proxy for “Good” and “Bad” to study how split-stocks react to news. In this work and others, the measure of interest is the statistical variance of volume and price changes. However, it is not clear that event studies using variance or volatility as the measure of interest have predictive value. Volatility can be defined as the standard deviation (square root of the variance) of the annual expected return of a security. By definition, volatility does not predict the direction of price movements, only a dispersion of possible annual returns, both negative and positive.

Upon close examination of the Economic event study literature, it is evident that prediction is not the purpose of the research. The motivation of this research is to find and explain a market behavior in the context of a correlation between specific events and price changes; thus, much of the research does not provide results for prediction, or recommend how the techniques described could be used in a prediction process. See Chan [3] for a comprehensive summary of previous related research for Economic event studies.

There is some recent research from the Machine Learning and Information Retrieval literature that is concerned with prediction. This research attempts to find a correlation with the words in the news that co-occur with surprising price changes. For example, Fawcett and Provost [5] find a set of words that often occur with 10% price changes in a stock. This type of text retrieval process shares a similarity to the invention described here, because it is extensible to events in general and not specific to predefined events. However, in this type of research the words predict when a particular price change event will occur, and there is no attempt to use an analyst's classification of “news” as input.

SUMMARY OF THE INVENTION

This SYSTEM AND METHOD FOR PREDICTING SECURITY PRICE MOVEMENTS USING FINANCIAL NEWS forecasts short-term price fluctuations in domestic or international stocks. However, the present invention may be utilized for any financial instrument and the embodiment of this approach is not limited to applications in the stock market.

In one specific embodiment of the approach, textual financial documents obtained from public interest web sites were reviewed by financial analysts and classified to be either “good news” or “bad news” relative to the expected performance of a financial instrument. In addition, “mixed news” and “mention news” were used as classifications for financial news. Distributions of price changes for a particular financial instrument were sampled from the data based on the occurrences of the different classifications of news. In this embodiment of the approach, the distributions were used to form a model that produces buy, sell, and no-trade signals for the financial instrument. The model is then used to predict when to buy, sell or not trade the stock given the daily occurrences of the underlying company's financial news.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A: Historical News and Price Classification Building Process.

FIG. 1B: Apparatus used by an analyst to visualize news articles and associated news classifications.

FIG. 1C: Apparatus used to gather a classification for a news article.

FIG. 2A: Prediction Model and 2-day price change distribution for a stock and for four classes of news.

FIG. 2B: Prediction Model and 2-day price change distributions for a stock and for days when good news appears.

FIG. 2C: Prediction Model and 2-day price change distributions for a stock and for days when bad news appears.

FIG. 2D: Prediction Model and 2-day price change distributions for a stock and for days when mixed news appears.

FIG. 2E: Prediction Model and 2-day price change distributions for a stock and for days when mention news appears.

FIG. 3: Prediction Model and 1-day price change distributions for a stock and for days when good news appears.

FIG. 4A: Daily Trade Decision Process for a Stock.

FIG. 4B: Apparatus used to predict price direction given prediction models.

FIG. 5: Sample Return Effectiveness based on 16 Stocks.

DETAILED DESCRIPTION OF THE INVENTION

The present invention, described herein, is for predicting short-term price fluctuations in domestic or international stocks. However, the present invention can be utilized for any financial instrument; therefore, it should be understood that the embodiment of this approach is not limited to applications in the stock market.

The salient distinction between this invention and previous approaches is the novel use of news as the input to the price prediction model. In embodiments of this invention, an analyst classifies or judges financial news articles using the following four classes or categories:

GOOD: good news, an event that improves the fundamental outlook of the company (ex: ‘results of a study that proved the high effectiveness of JNJ's coated stents, and cited it as likely the first to receive government approval’), better than expected earnings, a new contract, the expectation of new business, the acquiring of key personnel, etc.

BAD: bad news, something financially detrimental to the company or its industry, events such as extremely large litigation settlements, pipeline shutdowns due to indeterminately long political turmoil, unexpected poor earnings, loss of key clients, loss of key personnel, announcement of bankruptcy, unusual insider selling, etc.

MIXED: mixed news, some good and some bad news mixed in the same story, article not specifying why the price movement was contrary to what the fundamentals indicated (ex: while the earnings were bad year over year, they were better than consensus), bad earnings with expectation of good earnings growth, layoffs implying improved bottom line, loss of business and gain of new business, etc.

MENTION: mention news, the company's name is mentioned in an article in passing (ex: ‘JNJ is the second largest pharmaceutical company, behind MRK’), a fundamental change in a company that was announced weeks ago, etc.

The judgements for stories are used for two purposes: 1) to build a price prediction model (see FIGS. 1A-C), and 2) to be used as input for a daily price prediction process for making actual trades (see FIGS. 4A-B). Clearly, the analyst's judgements are subjective, but it is assumed that the analyst is an expert and has experience in the financial markets, and has some specification for the guidelines of the different categories. The above set of classes or categories would be useful for stocks.

In one embodiment of the invention, analysts classified financial news stories that were available on the internet from various news feeds. The stories and articles, which concerned publicly traded companies, were from the Associated Press and Reuters financial news wires. For the purpose of this embodiment, a total of three analysts were used, each with a Master's degree in Business Administration and a background comprising several years of financial markets experience. They were given guidelines similar to those listed above. In this embodiment, classification was based on the impact of the event on the financial outlook of the company, and not whether the stock price would go up or down.

A price prediction model for a stock is determined using historical closing prices and a set of financial news judgements for the articles about the stock. The approach is illustrated in FIG. 1A. The first step is to use historical daily closing prices for the stock and determine a mean, μ_(stock), and standard deviation, σ_(stock), for the change in price for a stock. The change in price for a stock between times t_(i) and t_(j) is: (closing_price(t_(j)) − closing_price(t_(i))) / closing_price(t_(i)). During the training period for the stock's price prediction model, distributions are gathered where t_(i) and t_(j) are 1 and 2 business days apart. The distribution of price changes is assumed to be approximately normal (bell shaped curve), and the distribution is represented as ˜N(μ_(stock), σ_(stock)), or ˜N(μ, σ) stock as a shorthand.

For example, assume we have a stock with the following data:

Stock: XMPL

Date          Closing Price  News Class       1-Day Change in Price  2-Day Change in Price
Jan. 2, 2002  1.00           GOOD, 1 article
Jan. 3, 2002  1.50           BAD, 1 article   0.50
Jan. 4, 2002  1.25           No News          −0.17                  0.25
Jan. 5, 2002  2.00           No News          0.60                   0.33

Changes are expressed as fractions of the earlier closing price (0.50 corresponds to 50%).

EXAMPLE 1

The training period is Jan. 2, 2002-Jan. 5, 2002.

The distribution of the 1-day change in price of the stock in general is:

t₁=0.5, t₂=−0.17, and t₃=0.6.

The distribution of the 2-day change in price of the stock in general is:

t₂=0.25, and t₃=0.33.

Pages A-1 to A-3 of the Appendix, which are incorporated herein by reference, provide a description and equations for calculating the mean and the standard deviation of a distribution.
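The price change calculation for Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name price_changes and the variable names are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed because the Appendix equations are not reproduced here.

```python
from statistics import mean, stdev

# Closing prices for the stock XMPL from Example 1, in date order.
closes = [1.00, 1.50, 1.25, 2.00]

def price_changes(prices, k):
    """Fractional change between closing prices k business days apart.

    Hypothetical helper: (close[t_j] - close[t_i]) / close[t_i] with t_j - t_i = k.
    """
    return [(prices[i + k] - prices[i]) / prices[i]
            for i in range(len(prices) - k)]

one_day = price_changes(closes, 1)   # changes ending Jan. 3, Jan. 4, Jan. 5
two_day = price_changes(closes, 2)   # changes ending Jan. 4, Jan. 5

# Assumption: sample standard deviation (n - 1 denominator).
mu_stock_1d = mean(one_day)
sigma_stock_1d = stdev(one_day)

print([round(c, 2) for c in one_day])
print([round(c, 2) for c in two_day])
```

With only four closing prices the estimates are not meaningful statistically; a real training period would span many months of daily closes.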

The apparatus for collecting analyst classifications via a website is illustrated in FIGS. 1B-C. A listing of news article titles for a company is displayed on the computer screen. In addition, each article has a graphic indicating the classification of the article, or a graphic indicating that the article needs to be classified. In one embodiment of the experiment (see FIG. 1B), an up arrow in a green box indicated the article was classified as good news, and a down arrow in a red box indicated bad news. An up and down arrow in a yellow box indicated mixed news, and a horizontal line in a gray box indicated mention news. If a J appeared in the box, the analyst clicked on the box and entered the information required to maintain the classification for the stock's news over time. The apparatus in FIG. 1C is used to collect the classification for each article of news. The stock's ticker, the date/time of the article, the location of the article, and the analyst's classification are entered. When done, the analyst clicks the ‘Submit Judgement’ button on the graphic user interface. The classifications are used with daily price changes to build the price prediction model for a stock.

Price change distributions for the days when news appears are determined for each class or category of news. For example, if at t₀, there existed an article assessed as “good news”, the price change between t₀ and t₁ becomes a member of the distribution for good news, which is assumed to be approximately normal and represented as ˜N(μ_(good), σ_(good)). In addition, distributions ˜N(μ_(bad), σ_(bad)), ˜N(μ_(mixed), σ_(mixed)), and ˜N(μ_(mention), σ_(mention)) are also determined for days where bad, mixed, and mention news appear in the news.

Referring to example 1 above:

The distribution of the 1-day change in price of the stock when good news appears is:

t₁=0.5

The distribution of the 2-day change in price of the stock when good news appears is:

t₂=0.25

The distribution of the 1-day change in price of the stock when bad news appears is:

t₂=−0.17

The distribution of the 2-day change in price of the stock when bad news appears is:

t₃=0.33
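A minimal sketch of how the per-class samples for Example 1 could be gathered is given below. The function class_changes and the data layout are hypothetical, and it is assumed that each day lists every news class at most once, regardless of how many articles of that class appeared.

```python
from collections import defaultdict

# (closing price, news classes reported that day) for XMPL, in date order.
days = [
    (1.00, ["GOOD"]),   # Jan. 2, 2002
    (1.50, ["BAD"]),    # Jan. 3, 2002
    (1.25, []),         # Jan. 4, 2002, no news
    (2.00, []),         # Jan. 5, 2002, no news
]

def class_changes(days, k):
    """Map each news class to the k-day price changes that start on its news days."""
    samples = defaultdict(list)
    for i in range(len(days) - k):
        start_price, news = days[i]
        end_price = days[i + k][0]
        change = (end_price - start_price) / start_price
        for news_class in news:
            samples[news_class].append(change)
    return dict(samples)

one_day_by_class = class_changes(days, 1)
two_day_by_class = class_changes(days, 2)

print({c: [round(x, 2) for x in v] for c, v in one_day_by_class.items()})
print({c: [round(x, 2) for x in v] for c, v in two_day_by_class.items()})
```

Classes with no articles in the training period (here MIXED and MENTION) simply have no entry, so no distribution can be estimated for them.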

The five distributions are used to create the price prediction model. The price prediction model has four classifiers that produce buy, sell, and no-trade signals. There is one classifier C_(class) for each news class, i.e., good, bad, mixed, and mention news. A classifier C_(class) produces a buy signal for a news class if (˜N(μ_(class), σ_(class)) ≠ ˜N(μ_(stock), σ_(stock)) and μ_(class) > 0), a sell signal if (˜N(μ_(class), σ_(class)) ≠ ˜N(μ_(stock), σ_(stock)) and μ_(class) < 0), and a no-trade signal otherwise. Whether ˜N(μ_(class), σ_(class)) ≠ ˜N(μ_(stock), σ_(stock)) is determined by a statistical hypothesis test that tests whether the distributions are significantly different [2]. Pages A-4, A-5, and A-6 in the appendix describe a significance test for determining whether the distributions are significantly different.

If the distributions are significantly different, then classifier C_(class) will produce a buy signal when μ_(class)>0, and a sell signal when μ_(class)<0. If the distributions are not significantly different, or μ_(class)=0, then classifier C_(class) will produce a no-trade signal.
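The decision rule of a classifier C_(class) can be written as a small function. The sketch below is illustrative only: the name classifier_signal is hypothetical, and the outcome of the significance test is passed in as a boolean rather than computed.

```python
def classifier_signal(mu_class, significantly_different):
    """Signal produced by a classifier for one news class.

    mu_class: mean price change on days when news of this class appears.
    significantly_different: outcome of the hypothesis test comparing the
        class distribution with the distribution of the stock in general.
    """
    if not significantly_different or mu_class == 0:
        return "no-trade"
    return "buy" if mu_class > 0 else "sell"

print(classifier_signal(0.02, True))    # significant, positive mean
print(classifier_signal(-0.013, True))  # significant, negative mean
print(classifier_signal(-0.008, False)) # not significant
```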

When the price distribution of the class of news is statistically-significantly different than the price distribution of the stock in general, i.e., (˜N(μ_(class), σ_(class)) ≠ ˜N(μ_(stock), σ_(stock))), it implies that μ_(class)≠μ_(stock) above and beyond random chance. In terms of price movement, it implies that, on average, the change in price of a stock will be μ_(class) when articles from the news class appear, and not μ_(stock). For example, if a stock has moved up on average 2% in one day when good news appears, and in general the stock has historically moved 0.01% a day, knowing this information implies that an investor can improve upon a buy and hold return strategy for the stock by investing only on the days when good news appears. If this event occurred 5 times in the course of a year, the investor would have an estimated return of 10%. The buy and hold strategy has an estimated return of roughly 2.8%.

In one embodiment of the invention (see FIGS. 2A-2E), price change distributions for Boeing were calculated for the trading days between Jun. 30, 1999, and Aug. 31, 2001 for every t_(j)−t_(i)=2 business days. In addition, distributions for the 2-day price changes were collected for the four news classes good, bad, mixed, and mention. The five distributions are specified in the legend of the graph in FIG. 2A, and shown individually relative to the 2-day changes in price of Boeing's stock in FIGS. 2B-2E.

In FIG. 2B, the distributions of 2-day price changes between Jun. 30, 1999 and Aug. 31, 2001 are plotted for Boeing in general (white area) and for days when good news appears (grey area within white area). For example, there were 2 occurrences of Boeing's stock changing by −7.5% over a 2-day period when good news was reported. On average, over a 2-day period when good news appeared on day t_(i), the stock went down 1.3% with a s.d. of 3.0%. The stock of Boeing was up an average of 0.1% over a 2-day period independent of the type of news reported. The standard deviation for the 2-day price change of the stock in general was 3.2% and is listed with the distribution based on good news in the legend of FIG. 2B. In FIG. 2C the distribution of 2-day price changes is graphed when bad news was reported. The stock went down an average of 1.9% with a s.d. of 3.3%. In FIG. 2D the distribution for mixed news appears, and the stock went down an average of 0.8% with a s.d. of 2.8%. In FIG. 2E the distribution of 2-day price changes when the stock is mentioned has an average of −0.7% with a s.d. of 3.6%. Note that the distributions for mixed and mention news are sparse and they only contain a few articles in the sample of articles available during this time period.

In this embodiment of the invention, a two-sample t test with unequal variance [2] was used. A threshold of α<0.1 was used to determine whether there was a significant difference between the sample distributions of 1 and 2-day price changes for the stock and the sample distributions of 1 and 2-day price changes for the stock when news from a particular news class appears. The 2-day distributions for Boeing are illustrated in FIGS. 2A-E. The four classifiers that make up the 2-day prediction model for Boeing are depicted in the legend of FIG. 2A. The distributions for good news (FIG. 2B) and bad news (FIG. 2C) were significantly different than the distribution of 2-day price changes for the stock in general. Since the means of the good and bad news price distributions are negative, the predictions for their associated classifiers will both be sell signals. The mixed news (FIG. 2D) and mention news (FIG. 2E) distributions were not significantly different than the distribution of the stock in general, and their classifiers in the 2-day prediction model for Boeing will produce no-trade signals.
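A two-sample t test with unequal variance (commonly called Welch's t test) can be sketched with the Python standard library alone. The code below is an illustration, not the procedure from the Appendix: all function names are hypothetical, the p-value is obtained by numerically integrating the Student t density with Simpson's rule, and the sample data are made up for demonstration and are not Boeing's.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density over [0, |t|].

    Uses Simpson's rule; steps must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)

def significantly_different(class_sample, stock_sample, alpha=0.1):
    """True when the two samples differ at the chosen threshold."""
    t, df = welch_t(class_sample, stock_sample)
    return two_sided_p(t, df) < alpha

# Made-up fractional price changes, for illustration only.
news_days = [-0.030, -0.020, -0.025, -0.035, -0.030]
all_days = [0.010, 0.000, 0.020, -0.010, 0.015]
print(significantly_different(news_days, all_days))
```

Each sample needs at least two observations and a non-zero variance for the statistic to be defined, which is one reason sparse classes such as mixed and mention news are hard to test.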

In another embodiment of the invention, the 1-day price change distributions for Boeing were calculated for the trading days between Jun. 30, 1999, and Aug. 31, 2001 for every t_(j)−t_(i)=1 business day. In addition, distributions for the 1-day price changes were collected for the four news classes good, bad, mixed, and mention. The distributions of 1-day price changes between Jun. 30, 1999 and Aug. 31, 2001 are plotted for Boeing in general (white area) and for days when bad news appears (black area within white area). On average, over a 1-day period when bad news appeared on day t_(i), the stock went down 1.2% with a s.d. of 2.3%. The stock of Boeing was up an average of 0.06% over a 1-day period in general. In this embodiment of the invention, bad news gave rise to a classifier with a sell signal, because the distribution of price changes when bad news appeared was statistically-significantly different based on the t test described above, and the other classes of news gave rise to classifiers producing no-trade signals.

In another embodiment of the invention (see FIG. 3), the same parameters for the t test were used to determine that a buy signal is predicted when stories discussing good news about AT&T appear. This was the case since the 1-day distribution of price changes for AT&T in general is statistically-significantly different than the 1-day distribution of price changes for days when articles containing good news appear. The bad, mixed and mention distributions resulted in classifiers that produce no-trade signals.

The daily price prediction process (see FIG. 4A), which can be used for making actual trades, uses the price prediction model for a stock, which is described above, and the analysts' classifications for news stories that appeared between the previous day's market close and the time of prediction. The articles about the stock during this time period are considered the stock's daily financial news. Each article is categorized by an analyst, and gives rise to one buy, sell, or no-trade signal. In an embodiment of the invention, the stock is purchased and held for 1 day (when t_(j)−t_(i)=1) if the number of buy signals is greater than the number of sell and no-trade signals combined. The stock is sold short and the trade unwound after 1 day (when t_(j)−t_(i)=1) if the number of sell signals is greater than the number of buy and no-trade signals combined.

In general, once the price prediction models are calculated for a financial instrument, it is straightforward to apply the price prediction model. The daily news for the financial instrument is categorized into good, bad, mixed, and mention news. Each article produces a trade signal depending on its news class and its associated classifier in the prediction model. If the number of buy signals exceeds the number of sell and no-trade signals, then the instrument is purchased and then sold in 1 or 2 days (depending on the number of days used to gather the distributions). If the number of sell signals exceeds the number of buy and no-trade signals, the instrument is sold short and then repurchased in 1 or 2 days. A no-trade decision is made when neither a buy nor a sell decision is predicted.

For example, once the prediction models for Boeing are determined (see FIGS. 2B-2E), the apparatus in FIG. 4B, which is an embodiment of the invention, can be used to predict future 1-day and 2-day price movements given the stock's prediction model. As depicted in FIGS. 2B-E, Boeing resulted in a prediction model for t_(j)−t_(i)=1 such that a sell signal results for good news and no-trade signals result for all other classifications of news. The prediction model for t_(j)−t_(i)=2 (see FIG. 2B) is such that a sell signal results for good and bad news and no-trade signals result for mixed and mention news. If Boeing had 4 good, 1 bad, 1 mixed, and 1 mention news article appear between the time of trading and the previous market's close, then the prediction would be to sell Boeing and unwind the trade in 1 day, and also (in a separate trade) to sell Boeing and unwind the trade in 2 days.
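The daily trade decision for this example can be sketched as a simple vote over the signals produced by each article. The function daily_decision and the dictionary layout are hypothetical; the two models encode the signals exactly as stated in the example above.

```python
from collections import Counter

def daily_decision(article_classes, model):
    """Combine per-article signals into one trade decision.

    article_classes: news class of each article since the previous close.
    model: mapping from news class to "buy", "sell", or "no-trade".
    """
    signals = Counter(model.get(c, "no-trade") for c in article_classes)
    buys, sells, no_trades = signals["buy"], signals["sell"], signals["no-trade"]
    if buys > sells + no_trades:
        return "buy"
    if sells > buys + no_trades:
        return "sell"
    return "no-trade"

# Prediction models as stated in the example (t_j - t_i = 1 and t_j - t_i = 2).
model_1d = {"good": "sell", "bad": "no-trade", "mixed": "no-trade", "mention": "no-trade"}
model_2d = {"good": "sell", "bad": "sell", "mixed": "no-trade", "mention": "no-trade"}

# 4 good, 1 bad, 1 mixed, and 1 mention article.
articles = ["good"] * 4 + ["bad", "mixed", "mention"]

print(daily_decision(articles, model_1d))  # 4 sell signals vs 3 no-trade signals
print(daily_decision(articles, model_2d))  # 5 sell signals vs 2 no-trade signals
```

Because a buy or sell requires a strict majority over all other signals combined, a day with many unclassifiable or mention-only articles defaults to no trade.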

One embodiment of this invention assumes that prediction and trading will occur a few minutes before the 4 pm stock market close of the current day. It was run for 16 stocks and their price prediction models were determined using distributions for 1 and 2-day price changes. The stock prediction models were based on historical closing prices and financial news occurring on the trading days between Jun. 30, 1999, and Aug. 31, 2001. The results are presented in FIG. 5. In total, 40 trades were predicted for the time period between Sep. 4, 2001 and Sep. 28, 2001. The average buy and hold return for this period was −11.37%, and the average prediction model return, or the resulting return using an embodiment of the invention, was 2.82% in the same period. The results suggest that using this invention produces a significantly greater return on investment than a buy and hold strategy. The data also suggests that for some stocks, there exists a correlation between the price movement of the stock, and the appearance of good, bad, mixed, and mention news.

Although the invention has been described and illustrated in the context of stocks, it is to be clearly understood that the same is intended by way of illustration and example only, and is not to be taken by way of limitation. The spirit and scope of this invention is also applicable to financial instruments of any kind that are affected by publicly available news.

1-28. (canceled)
29. A method of providing financial news items and significance of the financial news items, the method comprising the steps of: receiving financial news items for one or more financial instruments associated with business entities; classifying the financial news items into categories that include: a positive category indicating that a particular news item is favorable to the financial outlook of the associated business entity; and a negative category indicating that a particular news item is unfavorable to the financial outlook of the associated business entity; sending the financial news items to a computer of a user for display on the user computer; and sending the classified categories of the financial news items to the user computer for display on the user computer near the respective news items.
30. The method according to claim 29, wherein: each category is represented by a graphic symbol; and the step of sending the classified categories includes sending the associated graphic symbols.
31. The method according to claim 30, wherein: a graphic symbol for the positive category includes an arrow pointing upwards; and a graphic symbol for the negative category includes an arrow pointing downwards.
32. The method according to claim 29, wherein the categories include a mixed category indicating that a particular news item contains both a favorable and unfavorable financial outlooks of the associated business entity.
33. The method according to claim 29, wherein the categories include a mention category indicating that a particular news item mentions the associated business entity.
34. The method according to claim 29, wherein the categories include a mention category indicating that a particular news item mentions the associated publicly traded company.
35. A method of providing financial news items and significance of the financial news items for use in making trading decisions of financial instruments representing publicly traded companies, the method comprising the steps of: receiving financial news items for a plurality of financial instruments representing a plurality of publicly traded companies; classifying the financial news items into categories that indicate the impact of the news items on the financial outlook of the associated publicly traded companies; sending the financial news items to a computer of a user for display on the user computer; and sending the classified categories of the financial news items to the user computer for display on the user computer near the respective news items for analysis by the user in making trading decisions of the financial instruments.
36. The method according to claim 35, wherein the categories include: a positive category indicating that a particular news item is favorable to the financial outlook of the associated publicly traded companies; and a negative category indicating that a particular news item is unfavorable to the financial outlook of the associated publicly traded companies.
37. The method according to claim 35, wherein: each category is represented by a graphic symbol; and the step of sending the classified categories includes sending the associated graphic symbols.
38. The method according to claim 37, wherein: a graphic symbol for the positive category includes an arrow pointing upwards; and a graphic symbol for the negative category includes an arrow pointing downwards.
39. The method according to claim 35, wherein the categories include a mixed category indicating that a particular news item contains both a favorable and unfavorable financial outlooks of the publicly traded company.
40. The method according to claim 35 wherein the categories include a mention category indicating that a particular news item mentions the associated publicly traded company.